Approximation of the Proximal Operator of the $\ell_\infty$ Norm Using a Neural Network
Computing the proximal operator of the $\ell_\infty$ norm, $\textbf{prox}_{\alpha ||\cdot||_\infty}(\mathbf{x})$, generally requires a sort of the input data, or at least a partial sort similar to quicksort. In order to avoid using a sort, we present an $O(m)$ approximation of $\textbf{prox}_{\alpha ||\cdot||_\infty}(\mathbf{x})$ using a neural network. A novel aspect of the network is that it is able to accept vectors of varying lengths due to a feature selection process that uses moments of the input data. We present results on the accuracy of the approximation, feature importance, and computational efficiency of the approach. We show that the network outperforms a "vanilla neural network" that does not use feature selection. We also present an algorithm with corresponding theory to calculate $\textbf{prox}_{\alpha ||\cdot||_\infty}(\mathbf{x})$ exactly, relate it to the Moreau decomposition, and compare its computational efficiency to that of the approximation.
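The exact computation mentioned in the abstract can be illustrated via the Moreau decomposition, which reduces $\textbf{prox}_{\alpha ||\cdot||_\infty}(\mathbf{x})$ to a Euclidean projection onto the $\ell_1$ ball of radius $\alpha$: $\textbf{prox}_{\alpha ||\cdot||_\infty}(\mathbf{x}) = \mathbf{x} - P_{\{||\cdot||_1 \le \alpha\}}(\mathbf{x})$. The sketch below is not the paper's algorithm; it uses the standard sort-based $\ell_1$-ball projection (the sort being exactly the cost the paper's network approximation avoids), and the function names are illustrative.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto {u : ||u||_1 <= radius},
    via the standard sort-and-threshold algorithm."""
    if np.sum(np.abs(v)) <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index k with u[k] * (k+1) > css[k] - radius
    k = np.nonzero(u * np.arange(1, v.size + 1) > (css - radius))[0][-1]
    theta = (css[k] - radius) / (k + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def prox_linf(x, alpha):
    """prox of alpha*||.||_inf via the Moreau decomposition:
    prox_{alpha*||.||_inf}(x) = x - P_{||.||_1 <= alpha}(x)."""
    return x - project_l1_ball(x, alpha)
```

For example, `prox_linf(np.array([3.0, -1.0]), 1.0)` returns `[2.0, -1.0]`, and any input with $||\mathbf{x}||_1 \le \alpha$ maps to the zero vector, as the theory predicts. The sort inside `project_l1_ball` is the $O(m \log m)$ step that motivates an $O(m)$ learned approximation.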
Are emergent abilities of large language models a mirage? – Interview with Brando Miranda
Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo won a NeurIPS 2023 outstanding paper award for their work Are Emergent Abilities of Large Language Models a Mirage?. In their paper, they present an alternative explanation for emergent abilities in large language models. We spoke to Brando about this work, their alternative theory, and what inspired it. Asked what emergence means, he said: "This is a good and hard question to answer cleanly because the word emergence has been around in science for a while. For example, in physics, when you reach a certain number of uranium atoms you can make a bomb, but with fewer than that you can't."
Making use of supercomputers in financial machine learning
Cotte, Philippe, Lagier, Pierre, Margot, Vincent, Geissler, Christophe
This article is the result of a collaboration between Fujitsu and Advestis. The collaboration aimed to refactor and run an algorithm, based on systematic exploration, that produces investment recommendations on a Fugaku-type high-performance computer [11], in order to see whether a very high number of cores would allow a deeper exploration of the data than a cloud machine, hopefully resulting in better predictions. We found that increasing the number of explored rules yields a net increase in the predictive performance of the final ruleset. In the particular case of this study, however, we also found that using more than around 40 cores brings no significant gain in computation time. This limitation is explained by a threshold-based search heuristic used to prune the search space. We have evidence that for similar data sets with less restrictive thresholds, the number of cores actually used could be much higher, allowing parallelization to have a much greater effect.